Quantum random walks without walking
Quantum random walks have received much interest due to their non-intuitive
dynamics, which may hold the key to a new generation of quantum algorithms.
What remains a major challenge is a physical realization that is experimentally
viable and not limited to special connectivity criteria. We present a scheme
for walking on arbitrarily complex graphs, which can be realized using a
variety of quantum systems such as a BEC trapped inside an optical lattice.
This scheme is particularly elegant since the walker is not required to
physically step between the nodes; only flipping coins is sufficient.
Comment: 12 manuscript pages, 3 figure
Lower bounds in differential privacy
This is a paper about private data analysis, in which a trusted curator
holding a confidential database responds to real vector-valued queries. A
common approach to ensuring privacy for the database elements is to add
appropriately generated random noise to the answers, releasing only these
noisy responses. In this paper, we investigate various lower bounds on the
noise required to maintain different kinds of privacy guarantees.
Comment: Corrected some minor errors and typos. To appear in Theory of
Cryptography Conference (TCC) 201
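The noise-adding approach described in this abstract can be illustrated with a minimal sketch. The abstract does not say which noise distribution is used; the sketch below assumes the standard Laplace mechanism, and the names `laplace_noise`, `noisy_answer`, `l1_sensitivity` and `epsilon` are illustrative choices, not taken from the paper.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_answer(true_answer, l1_sensitivity, epsilon, rng=None):
    """Return a noisy copy of a real vector-valued query answer.

    Assumption: `l1_sensitivity` bounds how much the true answer can
    change in L1 norm when one database element changes. Under that
    assumption, independent Laplace noise with scale
    l1_sensitivity / epsilon on each coordinate is the standard
    epsilon-differentially-private Laplace mechanism.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    scale = l1_sensitivity / epsilon
    return [a + laplace_noise(scale, rng) for a in true_answer]
```

A curator would call `noisy_answer(answer, l1_sensitivity, epsilon)` and release only the returned vector. A smaller `epsilon` gives a larger noise scale, which is the noise-versus-privacy tradeoff that the lower bounds concern.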
Sampling Triples from Restricted Networks Using MCMC Strategy
In large networks, connected triples are useful for solving various tasks, including link prediction, community detection, and spam filtering. Existing work in this direction is mostly concerned with the exact or approximate counting of connected triples that are closed (i.e., triangles). The task of triple sampling has not been explored in depth, although sampling is a more fundamental task than counting and is useful for solving various other tasks, including counting. In recent years, some triple sampling methods based on direct sampling have been proposed, solely for the purpose of triangle count approximation. They sample only from a uniform distribution and are not effective for sampling triples from an arbitrary user-defined distribution. In this work we present two indirect triple sampling methods that are based on a Markov Chain Monte Carlo (MCMC) sampling strategy. Both methods are highly efficient compared to a direct sampling-based method, specifically for the task of sampling from a non-uniform probability distribution. Another significant advantage of the proposed methods is that they can sample triples from networks that have restricted access, to which a direct sampling-based method is simply not applicable.
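The idea of sampling triples with MCMC under restricted access can be sketched with a generic Metropolis-Hastings walk over connected triples. This is not either of the two methods proposed in the paper, whose details the abstract does not give. The neighborhood definition (triples that differ in one vertex), the function names, and the `weight` callback are all assumptions made for illustration. The walk only reads the adjacency sets of vertices in the current and proposed triples, which is the kind of access a restricted network typically allows.

```python
import random
from itertools import combinations


def is_connected_triple(adj, t):
    """A 3-vertex set is a connected triple if it induces at least 2 edges."""
    return sum(1 for u, v in combinations(t, 2) if v in adj[u]) >= 2


def neighbor_triples(adj, t):
    """Connected triples that differ from triple t in exactly one vertex."""
    out = set()
    for p, q in combinations(sorted(t), 2):  # the pair of vertices that is kept
        if q in adj[p]:
            cands = adj[p] | adj[q]  # edge p-q exists: one more edge suffices
        else:
            cands = adj[p] & adj[q]  # no edge p-q: new vertex must join both
        for z in cands - t:
            out.add(frozenset((p, q, z)))
    return sorted(out, key=sorted)  # sorted only to make runs reproducible


def mh_triple_sampler(adj, start, weight, steps, rng):
    """Metropolis-Hastings walk whose target is proportional to weight(triple).

    Assumes weight(t) > 0 for every connected triple and that the triple
    graph defined by neighbor_triples is connected.
    """
    x = frozenset(start)
    nx = neighbor_triples(adj, x)
    samples = []
    for _ in range(steps):
        if nx:
            y = rng.choice(nx)  # uniform proposal among the neighbors of x
            ny = neighbor_triples(adj, y)
            # Hastings correction for the unequal numbers of neighbors.
            ratio = (weight(y) * len(nx)) / (weight(x) * len(ny))
            if rng.random() < min(1.0, ratio):
                x, nx = y, ny
        samples.append(x)
    return samples
```

Passing `weight=lambda t: 1.0` targets the uniform distribution over connected triples; any other positive function targets a user-defined distribution. As with any MCMC method, consecutive samples are correlated, so in practice a burn-in period and thinning would be needed.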
Quantified Derandomization of Linear Threshold Circuits
One of the prominent current challenges in complexity theory is the attempt
to prove lower bounds for TC^0, the class of constant-depth, polynomial-size
circuits with majority gates. Relying on the results of Williams (2013), an
appealing approach to prove such lower bounds is to construct a non-trivial
derandomization algorithm for TC^0. In this work we take a first step towards
the latter goal, by proving the first positive results regarding the
derandomization of circuits of depth .
Our first main result is a quantified derandomization algorithm for
circuits with a super-linear number of wires. Specifically, we construct an
algorithm that gets as input a circuit over input bits with
depth and wires, runs in almost-polynomial-time, and
distinguishes between the case that rejects at most inputs
and the case that accepts at most inputs. In fact, our
algorithm works even when the circuit is a linear threshold circuit, rather
than just a circuit (i.e., is a circuit with linear threshold gates,
which are stronger than majority gates).
Our second main result is that even a modest improvement of our quantified
derandomization algorithm would yield a non-trivial algorithm for standard
derandomization of all of , and would consequently imply that
. Specifically, if there exists a quantified
derandomization algorithm that gets as input a circuit with depth
and wires (rather than wires), runs in time at
most , and distinguishes between the case that rejects at
most inputs and the case that accepts at most
inputs, then there exists an algorithm with running time
for standard derandomization of .
Comment: Changes in this revision: An additional result (a PRG for quantified
derandomization of depth-2 LTF circuits); rewrite of some of the exposition;
minor correction
Parallel Repetition of Entangled Games with Exponential Decay via the Superposed Information Cost
In a two-player game, two cooperating but non-communicating players, Alice
and Bob, receive inputs taken from a probability distribution. Each of them
produces an output and they win the game if they satisfy some predicate on
their inputs/outputs. The entangled value of a game is the
maximum probability that Alice and Bob can win the game if they are allowed to
share an entangled state prior to receiving their inputs.
The -fold parallel repetition of consists of instances of
where the players receive all the inputs at the same time and produce all
the outputs at the same time. They win if they win each instance of .
In this paper we show that for any game such that , decreases exponentially in . First, for
any game on the uniform distribution, we show that , where and are the sizes of the input
and output sets. From this result, we show that for any entangled game ,
where is the input distribution of and
. This implies parallel
repetition with exponential decay as long as for
general games. To prove this parallel repetition, we introduce the concept of
Superposed Information Cost for entangled games, which is inspired by
the information cost used in communication complexity.
Comment: In the first version of this paper we presented a different, stronger
Corollary 1 but due to an error in the proof we had to modify it in the
second version. This third version is a minor update. We correct some typos
and re-introduce a proof accidentally commented out in the second versio
FLEET: Butterfly Estimation from a Bipartite Graph Stream
We consider space-efficient single-pass estimation of the number of
butterflies, a fundamental bipartite graph motif, from a massive bipartite
graph stream where each edge represents a connection between entities in two
different partitions. We present a space lower bound for any streaming
algorithm that can estimate the number of butterflies accurately, as well as
FLEET, a suite of algorithms for accurately estimating the number of
butterflies in the graph stream. Estimates returned by the algorithms come with
provable guarantees on the approximation error, and experiments show good
tradeoffs between the space used and the accuracy of approximation. We also
present space-efficient algorithms for estimating the number of butterflies
within a sliding window of the most recent elements in the stream. While there
is a significant body of work on counting subgraphs such as triangles in a
unipartite graph stream, our work seems to be one of the few to tackle the case
of bipartite graph streams.
Comment: This is the author's version of the work. It is posted here by
permission of ACM for your personal use. Not for redistribution. The
definitive version was published in Seyed-Vahid Sanei-Mehri, Yu Zhang, Ahmet
Erdem Sariyuce and Srikanta Tirthapura. "FLEET: Butterfly Estimation from a
Bipartite Graph Stream". The 28th ACM International Conference on Information
and Knowledge Managemen
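To make the quantity being estimated concrete: a butterfly is a complete 2x2 bipartite subgraph, that is, two vertices on one side both connected to the same two vertices on the other side. The sketch below counts butterflies exactly and then shows a simple fixed-probability edge-sampling estimator. It is not the FLEET algorithm, whose details are not given in the abstract, and the function names and the parameter `p` are illustrative assumptions.

```python
import random
from collections import defaultdict
from itertools import combinations


def count_butterflies(edges):
    """Exact butterfly count for an iterable of (left, right) edges.

    Every pair of left vertices with c common right neighbors
    contributes c * (c - 1) / 2 butterflies.
    """
    nbrs = defaultdict(set)
    for left, right in edges:
        nbrs[left].add(right)
    total = 0
    for a, b in combinations(list(nbrs), 2):
        c = len(nbrs[a] & nbrs[b])
        total += c * (c - 1) // 2
    return total


def estimate_butterflies(edge_stream, p, rng):
    """Single-pass estimate: keep each edge independently with probability p.

    A butterfly survives only if all four of its edges are kept, which
    happens with probability p**4, so the sampled count is scaled by
    1 / p**4. Assumes the stream contains no duplicate edges.
    """
    kept = [e for e in edge_stream if rng.random() < p]
    return count_butterflies(kept) / p ** 4
```

The estimator stores only about a `p` fraction of the stream, which illustrates the tradeoff between space and accuracy that the abstract refers to: a smaller `p` saves space but increases the variance of the estimate.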
Selectivity estimation on set containment search
© Springer Nature Switzerland AG 2019. In this paper, we study the problem of selectivity estimation on set containment search. Given a query record Q and a record dataset S, we aim to accurately and efficiently estimate the selectivity of set containment search of query Q over S. The problem has many important applications in commercial fields and scientific studies. To the best of our knowledge, this is the first work to study this important problem. We first extend existing distinct value estimation techniques to solve this problem and develop IL-GKMV, an approach based on inverted lists and the G-KMV sketch. Our analysis shows that the performance of IL-GKMV degrades as the vocabulary size increases. Motivated by the limitations of existing techniques and the inherent challenges of the problem, we develop effective and efficient sampling approaches and propose OT-Sampling, a sampling approach based on an ordered trie structure. OT-Sampling partitions records based on element frequency and occurrence patterns and is significantly more accurate than the simple random sampling method and IL-GKMV. To further enhance performance, we present DC-Sampling, a divide-and-conquer based sampling approach that uses an inclusion/exclusion prefix to exploit pruning opportunities. We theoretically analyse the proposed techniques with respect to various accuracy estimators. Our comprehensive experiments on 6 real datasets verify the effectiveness and efficiency of our proposed techniques.
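The simple random sampling baseline that the abstract compares against can be sketched as follows. This is only the baseline, not IL-GKMV, OT-Sampling or DC-Sampling. The sketch assumes that a record X matches when it contains every element of the query Q; the abstract does not state the direction of containment, and the function names are illustrative.

```python
import random


def exact_selectivity(records, query):
    """Number of records that contain every element of the query.

    Assumption: "set containment" is read as query <= record.
    """
    q = set(query)
    return sum(1 for r in records if q <= set(r))


def sampled_selectivity(records, query, sample_size, rng):
    """Estimate the selectivity from a uniform sample without replacement.

    The hit count in the sample is scaled by len(records) / sample_size.
    `records` must be a sequence such as a list.
    """
    sample = rng.sample(records, sample_size)
    hits = exact_selectivity(sample, query)
    return hits * len(records) / sample_size
```

With rare queries, a uniform sample often contains no matching record at all, which is one reason a more structured sampling scheme can be more accurate.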
Sublinear Estimation of Weighted Matchings in Dynamic Data Streams
This paper presents an algorithm for estimating the weight of a maximum
weighted matching by augmenting any estimation routine for the size of an
unweighted matching. The algorithm is implementable in any streaming model
including dynamic graph streams. We also give the first constant estimation for
the maximum matching size in a dynamic graph stream for planar graphs (or any
graph with bounded arboricity) using space which also
extends to weighted matching. Using previous results by Kapralov, Khanna, and
Sudan (2014) we obtain a approximation for general graphs
using space in random order streams, respectively. In
addition, we give a space lower bound of for any
randomized algorithm estimating the size of a maximum matching up to a
factor for adversarial streams
Sparse recovery with partial support knowledge
14th International Workshop, APPROX 2011, and 15th International Workshop, RANDOM 2011, Princeton, NJ, USA, August 17-19, 2011. Proceedings
The goal of sparse recovery is to recover the (approximately) best k-sparse approximation $\hat{x}$ of an n-dimensional vector x from linear measurements Ax of x. We consider a variant of the problem which takes into account partial knowledge about the signal. In particular, we focus on the scenario where, after the measurements are taken, we are given a set S of size s that is supposed to contain most of the “large” coefficients of x. The goal is then to find $\hat{x}$ such that
$$\|x - \hat{x}\|_p \le C \min_{\substack{k\text{-sparse } x' \\ \operatorname{supp}(x') \subseteq S}} \|x - x'\|_q .$$
We refer to this formulation as the sparse recovery with partial support knowledge problem (SRPSK). We show that SRPSK can be solved, up to an approximation factor of C = 1 + ε, using O((k/ε) log(s/k)) measurements, for p = q = 2. Moreover, this bound is tight as long as s = O(εn / log(n/ε)). This completely resolves the asymptotic measurement complexity of the problem except for a very small range of the parameter s.
To the best of our knowledge, this is the first variant of (1 + ε)-approximate sparse recovery for which the asymptotic measurement complexity has been determined.
Space and Naval Warfare Systems Center San Diego (U.S.) (Contract N66001-11-C-4092); David & Lucile Packard Foundation (Fellowship); Center for Massive Data Algorithmics (MADALGO); National Science Foundation (U.S.) (Grant CCF-0728645); National Science Foundation (U.S.) (Grant CCF-1065125
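The benchmark on the right-hand side of the SRPSK guarantee can be computed directly when x is known, which helps clarify what the recovered vector is compared against. For p = q = 2, the best k-sparse x' with support inside S keeps the k largest-magnitude entries of x among the indices in S. The sketch below computes that benchmark only; it is not the recovery algorithm, which works from the measurements Ax, and the function name is an illustrative assumption.

```python
import math


def best_restricted_k_sparse(x, k, S):
    """Best k-sparse approximation of x with support inside S (L2 norm).

    Returns (x_prime, error), where error = ||x - x_prime||_2 is the
    minimum over all k-sparse vectors supported on S.
    """
    # Order the allowed indices by magnitude of x; sorting the indices
    # first keeps tie-breaking deterministic.
    ranked = sorted(sorted(S), key=lambda i: abs(x[i]), reverse=True)
    keep = set(ranked[:k])
    x_prime = [x[i] if i in keep else 0.0 for i in range(len(x))]
    error = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_prime)))
    return x_prime, error
```

For example, with `x = [5.0, -4.0, 0.5, 3.0, 0.1]`, `k = 2` and `S = {1, 2, 3}`, the large coefficient at index 0 lies outside S and therefore counts toward the benchmark error. A solution to SRPSK with C = 1 + ε must come within that factor of this error.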